Five Two-Qubit Gates Are Necessary for Implementing Toffoli Gate

In this paper, we settle the long-standing open problem of the minimum number of two-qubit gates needed to simulate a Toffoli gate. More precisely, we show that five two-qubit gates are necessary. Before our work, it was known that five gates are sufficient, and only numerical evidence had been gathered indicating that the five-gate implementation is optimal. The idea introduced here can also be used to solve the problem of the optimal simulation of the three-qubit controlled phase gate introduced by Deutsch in 1989.



Since quantum computation provides the possibility of solving certain problems that are believed to be infeasible on a classical computer [?], a huge amount of effort has been devoted to building functional and scalable quantum computers over the last two decades. The quantum logical circuit is the most popular model of quantum computer hardware. In order to be a general-purpose computational device, a quantum computer must implement a small set of quantum logical gates [?] that can universally serve as the basic building blocks of quantum circuits, in the same way as classical logical gates do for conventional digital circuits. It is quite natural to choose certain gates operating on a small number of qubits as the basic gates.

Theoretically, any two-qubit gate that can create entanglement, such as the controlled-NOT (CNOT) gate, is universal when combined with all single-qubit gates [?]. It has also been demonstrated experimentally that two-qubit gates can be realized with high fidelity using current technology; for example, two-qubit gates with superconducting qubits have been demonstrated with fidelities higher than 90% [?]. Finding more efficient ways to implement quantum gates may allow small-scale quantum computing tasks to be demonstrated on a shorter time scale. More precisely, realizing multi-qubit gates with the smallest possible number of basic gates would help considerably in combating decoherence. Thus an important problem is how to implement gates on more than two qubits using only two-qubit gates. Indeed, studying the minimum number of two-qubit gates needed to simulate a multi-qubit gate is not only of theoretical importance but also an experimental requirement: to carry out a quantum algorithm, even a small one, one has to exert a relatively high level of control over the multi-qubit quantum system. A lot of experimental effort has been devoted to demonstrating multi-qubit controlled-NOT gates in ion traps [8], linear optics [9], superconductors, and atoms [11].

The Toffoli gate is perhaps one of the most important quantum logical gates, as it can universally realize classical reversible computation [12], as well as universal quantum computation [13] with little extra help. It also plays a central role in quantum error correction [11, 14-17]. Recently, experimental implementation of the Toffoli gate has received considerable attention. However, the optimal simulation of the Toffoli gate by two-qubit quantum logical gates has remained unknown; this is an important open problem explicitly listed in the influential textbook on quantum computation [?]. Here, we settle this problem by showing that five two-qubit gates are necessary and sufficient for implementing the Toffoli gate. A decomposition of the Toffoli gate into five two-qubit gates has long been known, but before this work only numerical evidence that the five-gate implementation is optimal had been found [21-?]. Our result gives, for the first time, a theoretical proof of optimality that goes beyond numerical evidence.

The Toffoli gate is simply a three-qubit controlled-NOT gate, and its function can be explained intuitively as follows. The gate acts on three quantum bits, namely $A$, $B$, and $C$. Here $A$ and $B$ are the control qubits, and $C$ is the target qubit. Let us fix a computational basis $\{|0\rangle, |1\rangle\}$ for each qubit. Upon an input $|abc\rangle$, the gate outputs the states of $A$ and $B$ unchanged and flips system $C$ only if both $A$ and $B$ are in state $1$. The Toffoli gate can be depicted by the following diagram:

[Figure: circuit diagram of the Toffoli gate.]

It can also be written in the following form by considering its action on the computational basis:

$$T_{ABC} = I - |110\rangle\langle 110| - |111\rangle\langle 111| + |110\rangle\langle 111| + |111\rangle\langle 110|.$$
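As a quick sanity check of the operator form above, the following minimal NumPy sketch builds the $8 \times 8$ matrix term by term and confirms that it maps $|abc\rangle$ to $|a, b, c \oplus ab\rangle$. The helper names (`ket`, `proj`, `toffoli_on_bits`) and the convention that $A$ is the most significant bit are illustrative assumptions, not part of the paper.

```python
import numpy as np

def ket(bits: str) -> np.ndarray:
    """Computational-basis vector |bits>, e.g. ket("110"); A is the leftmost bit."""
    v = np.zeros(2 ** len(bits))
    v[int(bits, 2)] = 1.0
    return v

def proj(b1: str, b2: str) -> np.ndarray:
    """Outer product |b1><b2|."""
    return np.outer(ket(b1), ket(b2))

# T = I - |110><110| - |111><111| + |110><111| + |111><110|
toffoli = (np.eye(8) - proj("110", "110") - proj("111", "111")
           + proj("110", "111") + proj("111", "110"))

def toffoli_on_bits(a: int, b: int, c: int) -> tuple:
    """Apply the matrix to |abc> and read off the output basis state."""
    out = toffoli @ ket(f"{a}{b}{c}")
    idx = int(np.argmax(out))
    return (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
```

For instance, `toffoli_on_bits(1, 1, 0)` flips the target because both controls are set, while any input with a control equal to 0 is returned unchanged.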

It is well known that the Toffoli gate is universal for classical computation in the sense that all conventional Boolean circuits can be built from it in a reversible way. It was also proved to be universal for quantum computation if the one-qubit Hadamard gate is provided as a free resource [13]. Furthermore, a series of works showed that the Toffoli gate is an indispensable ingredient in realizing fault-tolerant quantum computation [11, 14-17]. Recently, rapid progress has been made on implementing the Toffoli gate experimentally. The first experimental realization of the quantum Toffoli gate was presented in an ion-trap quantum computer in January 2009 at the University of Innsbruck, Austria [?]. A new approach using higher-dimensional Hilbert spaces was proposed [?] that simplifies the implementation of the Toffoli gate in linear optics [9] and superconducting circuits [?].

Due to its significance in quantum computing, the theoretical pursuit of an efficient implementation of the Toffoli gate using a sequence of single- and two-qubit gates has a long history. It is well known that six CNOT gates are optimal when single-qubit unitaries are provided as free resources [21, 26-28]. Then an interesting question naturally arises: how many general two-qubit gates, rather than CNOT gates, are required to implement the Toffoli gate? This question has attracted many researchers over the last two decades. In particular, Nielsen and Chuang explicitly listed it as an unsolved problem in their standard textbook on quantum computation [?] (see page 213, Problem 4.4). What was known until now is that the Toffoli gate can be decomposed into a circuit consisting of five two-qubit gates, and numerical evidence has been gathered indicating that the five-gate implementation is optimal [21-?]. Here, we finally settle this problem and present a theoretical proof of optimality.
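The text does not spell out which five-gate decomposition it has in mind. A standard textbook construction uses two CNOT gates and three controlled-$\sqrt{X}$ gates, and the sketch below checks numerically that this particular circuit reproduces the Toffoli gate. The choice of construction, the helper names, and the qubit indexing ($A = 0$, $B = 1$, $C = 2$) are assumptions made for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)   # |0><0|
P1 = np.diag([0, 1]).astype(complex)   # |1><1|
# A square root of X: SQRT_X @ SQRT_X equals X.
SQRT_X = 0.5 * np.array([[1 + 1j, 1 - 1j], [1 - 1j, 1 + 1j]])

def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out

def controlled(u, control, target):
    """Controlled-u on three qubits: u acts on `target` when `control` is |1>."""
    off = [I2, I2, I2]
    off[control] = P0
    on = [I2, I2, I2]
    on[control] = P1
    on[target] = u
    return kron_all(off) + kron_all(on)

A, B, C = 0, 1, 2
gates_in_time_order = [
    controlled(SQRT_X, B, C),
    controlled(X, A, B),
    controlled(SQRT_X.conj().T, B, C),
    controlled(X, A, B),
    controlled(SQRT_X, A, C),
]

circuit = np.eye(8, dtype=complex)
for gate in gates_in_time_order:
    circuit = gate @ circuit          # later gates multiply from the left

toffoli = np.eye(8, dtype=complex)
toffoli[6:, 6:] = X                   # flip C only when A = B = 1
```

Comparing `circuit` with `toffoli` via `np.allclose` confirms sufficiency of five gates for this construction; it says nothing about necessity, which is the subject of the proof.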

Let $V_{ABC} = I_{ABC} - 2|111\rangle\langle 111|$, with $I_{ABC}$ the identity operator on the Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$. It is evident that $V_{ABC} = (I_{AB} \otimes H_C)\, T_{ABC}\, (I_{AB} \otimes H_C)$, where $H$ is the Hadamard gate given by

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

In other words, $V_{ABC}$ and the Toffoli gate $T_{ABC}$ are equivalent up to the local unitary $H_C$. By absorbing $H_C$ into any two-qubit gate acting on $AC$ or $BC$, we can easily conclude that $V_{ABC}$ and $T_{ABC}$ require the same number of two-qubit gates to realize. Thus in the following discussion we focus on the minimal cost of simulating $V_{ABC}$ using two-qubit gates.
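The local equivalence between $V_{ABC}$ and $T_{ABC}$ is easy to confirm numerically. The sketch below is only an illustration; the variable names and the ordering of qubits ($A$ most significant) are assumptions.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate

toffoli = np.eye(8)
toffoli[6:, 6:] = [[0, 1], [1, 0]]             # X on C when A = B = 1

V = np.eye(8)
V[7, 7] = -1.0                                 # I - 2|111><111|

IH = np.kron(np.eye(4), H)                     # I_AB tensor H_C
conjugated = IH @ toffoli @ IH                 # (I ⊗ H) T (I ⊗ H)
```

The identity rests on $H X H = Z$: conjugating the $X$ in the $|11\rangle_{AB}$ block by $H$ turns it into $\mathrm{diag}(1, -1)$, which is exactly the sign flip on $|111\rangle$.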

The gate $V_{ABC}$ is a real Hermitian matrix that is invariant under any permutation of the subsystems $A$, $B$, and $C$. Thus it can be regarded as a controlled gate with control on each qubit. Recall that a bipartite unitary $U_{AB}$ acting on a qubit system $A$ and a general system $B$ is said to be a controlled gate with control on $A$ if it can be decomposed into the form $U_{AB} = |0_A\rangle\langle 0_A| \otimes U_0 + |1_A\rangle\langle 1_A| \otimes U_1$. This simple observation helps reduce the number of cases we need to consider.
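The two properties used here, permutation invariance and the block form of a controlled gate, can be checked with a few lines of NumPy. This is a minimal sketch: `permute_qubits` and `is_controlled_on_first` are hypothetical helpers, and the block-diagonal test only covers the literal decomposition above, not gates that become controlled after absorbing local unitaries.

```python
import itertools
import numpy as np

V = np.eye(8)
V[7, 7] = -1.0   # V_ABC = I - 2|111><111|

def permute_qubits(U, perm):
    """Relabel the three qubits of an 8x8 operator according to perm."""
    T = U.reshape([2] * 6)                      # axes: (a, b, c, a', b', c')
    axes = list(perm) + [p + 3 for p in perm]   # same permutation on rows and columns
    return T.transpose(axes).reshape(8, 8)

def is_controlled_on_first(U):
    """True if U = |0><0| ⊗ U_0 + |1><1| ⊗ U_1 with control on the first qubit."""
    d = U.shape[0] // 2
    return bool(np.allclose(U[:d, d:], 0) and np.allclose(U[d:, :d], 0))

# Invariance under all six permutations of A, B, C.
symmetric = all(np.allclose(permute_qubits(V, p), V)
                for p in itertools.permutations(range(3)))

# Move each qubit to the front in turn and test the block form.
controlled_on_each = all(
    is_controlled_on_first(permute_qubits(V, [q] + [r for r in range(3) if r != q]))
    for q in range(3))
```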

Since $V_{ABC}$ is a three-qubit gate acting on $ABC$, any two-qubit gate used to implement $V_{ABC}$ falls into one of three types: $\mathcal{K}_{AB}$, the gates acting on systems $A$ and $B$, and likewise $\mathcal{K}_{BC}$ and $\mathcal{K}_{AC}$. Clearly, it is impossible that all two-qubit gates used to simulate $V_{ABC}$ belong to the same type. Furthermore, we can verify that two two-qubit gates are not sufficient for simulating $V_{ABC}$. To see this, one only needs to notice that $U_{AB} U_{BC} = V_{ABC}$ implies that $U_{BC}$ is also a controlled gate with control system $C$. This leads to a contradiction by a routine calculation.

Three observations are quite helpful in our proof: i) Any two-dimensional subspace of a two-qubit system contains some product state; ii) A two-qubit unitary $U_{AB}$ can be regarded as a controlled gate with control system $A$ if the state of qubit $A$ in $U_{AB}|0\rangle_A|y\rangle_B$ is always $|0\rangle_A$ for any state $|y\rangle_B$ of system $B$; iii) Let $U_{AB} U_{AC}$ be a three-qubit unitary that can be regarded as a controlled gate across the bipartition $A$-$BC$ with control system $A$. Then there exist one-qubit gates $v_{B1}, v_{B2}$ on $\mathcal{H}_B$ and $w_{C1}, w_{C2}$ on $\mathcal{H}_C$ such that $U_{AB} U_{AC} = |0\rangle\langle 0| \otimes v_{B1} \otimes w_{C1} + |1\rangle\langle 1| \otimes v_{B2} \otimes w_{C2}$.

Observations i) and ii) are obvious. To see iii), we can assume $U_{AC}|0\rangle_A|\psi\rangle_C = |0\rangle_A|\psi\rangle_C$ by moving the local unitary to the left of $U_{AB}$. Then $U_{AB} U_{AC}|0\rangle_A|y\rangle_B|\psi\rangle_C = U_{AB}|0\rangle_A|y\rangle_B|\psi\rangle_C$. Note that the state of the $A$ part of $U_{AB}|0\rangle_A|y\rangle_B$ is always $|0\rangle$, which by observation ii) means that $U_{AB}$ is a controlled gate with control on $A$. Similarly, $U_{AC}$ is also a controlled gate with control on $A$. Hence the result follows.
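Observation i) can also be made constructive. A two-qubit vector is a product state exactly when its $2 \times 2$ coefficient matrix has zero determinant, and along a two-dimensional subspace this determinant is a quadratic polynomial, which always has a complex root. The sketch below follows that idea; the function name and the tolerance are illustrative assumptions, and the two input vectors are assumed to be linearly independent.

```python
import numpy as np

def product_state_in_span(psi, phi):
    """Return a normalized product state in span{psi, phi} (two-qubit vectors).

    Solves det(Psi + t * Phi) = 0 for t, where Psi and Phi are the 2x2
    coefficient matrices of the two vectors.
    """
    Psi = np.asarray(psi, dtype=complex).reshape(2, 2)
    Phi = np.asarray(phi, dtype=complex).reshape(2, 2)
    if abs(np.linalg.det(Phi)) < 1e-12:
        vec = Phi                          # phi is already a product state
    else:
        # det(Psi + t*Phi) = det(Phi) t^2 + c1 t + det(Psi)
        c1 = (Psi[0, 0] * Phi[1, 1] + Phi[0, 0] * Psi[1, 1]
              - Psi[0, 1] * Phi[1, 0] - Phi[0, 1] * Psi[1, 0])
        t = np.roots([np.linalg.det(Phi), c1, np.linalg.det(Psi)])[0]
        vec = Psi + t * Phi
    return vec.reshape(4) / np.linalg.norm(vec)
```

For example, the span of $(|00\rangle + |11\rangle)/\sqrt{2}$ and $(|01\rangle + |10\rangle)/\sqrt{2}$ consists mostly of entangled states, yet the routine returns one of the product states $|\pm\rangle|\pm\rangle$ (up to a phase).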

Now we show that three two-qubit gates are not sufficient to implement $V_{ABC}$. We achieve this by analysing all possible circuits consisting of three two-qubit gates. Due to the high symmetry of $V_{ABC}$, we only need to consider the following two cases:

Case 1: The three gates belong to just two types. Without loss of generality (wlog), we can assume that two gates are of type $\mathcal{K}_{AB}$ and the third is of type $\mathcal{K}_{BC}$, and the circuit is (note that time goes from left to right)

[Figure: circuit on qubits $A$, $B$, $C$ applying $U_{AB2}$, then $U_{BC}$, then $U_{AB1}$.]

We only need to show that the following equation has no solution:

$$U_{AB1} U_{BC} U_{AB2} = V_{ABC},$$

where $U_{AB1}$ and $U_{AB2}$ are of type $\mathcal{K}_{AB}$, and $U_{BC}$ is of type $\mathcal{K}_{BC}$. Then $U_{BC}$ must be a controlled gate with control on $C$, as one sees from $U_{BC} = U_{AB1}^\dagger V_{ABC} U_{AB2}^\dagger$, where $\dagger$ stands for the Hermitian conjugate. We can write $U_{BC} = |0\rangle\langle 0|_C \otimes I_B + |1\rangle\langle 1|_C \otimes w_B$. A direct calculation leads to the conclusion that $I_A \otimes w_B$ and $I - 2|11\rangle\langle 11|$ share the same set of eigenvalues (counting multiplicity). That is impossible, since every eigenvalue of $I_A \otimes w_B$ has even multiplicity, whereas $-1$ is a simple eigenvalue of $I - 2|11\rangle\langle 11|$.

Case 2: The three gates belong to different types. Wlog, we can assume the circuit is

$$U_{AB} U_{BC} U_{AC} = V_{ABC}.$$

We know that $U_{BC} U_{AC}$ is a controlled gate with control bit $C$. As just discussed, we obtain that $U_{BC}$ is a controlled gate with control system $C$, and so is $U_{AC}$. Consequently, by working out the form of the controlled unitaries directly, we can assert that $I - 2|11\rangle\langle 11|$ is a local unitary. That is again impossible.
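Both contradictions above rest on simple properties of $I - 2|11\rangle\langle 11|$: it has a non-degenerate eigenvalue $-1$, and it is not of the product form $u \otimes v$. The sketch below checks these numerically. Using the operator-Schmidt rank as the locality test is our own choice for illustration, not a step taken in the paper, and all helper names are assumptions.

```python
import numpy as np

CZ = np.diag([1.0, 1.0, 1.0, -1.0])   # I - 2|11><11|

def operator_schmidt_rank(U, tol=1e-9):
    """Number of nonzero operator-Schmidt coefficients of a two-qubit operator.

    A local (product) operator u ⊗ v has rank 1.
    """
    T = np.asarray(U).reshape(2, 2, 2, 2)        # axes: (a, b, a', b')
    M = T.transpose(0, 2, 1, 3).reshape(4, 4)    # group (a, a') against (b, b')
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > tol))

def random_unitary(n, rng):
    """Random n x n unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

# Multiplicity of the eigenvalue -1 in I - 2|11><11|.
minus_one_multiplicity = int(np.sum(np.isclose(np.linalg.eigvalsh(CZ), -1.0)))
```

Here `operator_schmidt_rank(CZ)` evaluates to 2, while any `np.kron(u, v)` built from single-qubit unitaries has rank 1, so the two can never coincide.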

We can generalize this technique to show that the gate $V_{ABC}$ cannot be implemented by any circuit consisting of four nonlocal two-qubit gates. We do not count one-qubit gates, as they can be absorbed into the relevant two-qubit gates. Again, the symmetry of $V_{ABC}$ enables us to consider only the following two cases:

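Case 1: The four gates belong to only two types, say $\mathcal{K}_{AB}$ and $\mathcal{K}_{BC}$. Due to the symmetry of $V_{ABC}$, we only need to show that the following circuit cannot equal $V_{ABC}$:

[Figure: circuit on qubits $A$, $B$, $C$ applying $U_{BC2}$, then $U_{AB2}$, then $U_{BC1}$, then $U_{AB1}$.]

That is, we need to show that the following equation has no solution:

$$U_{AB1} U_{BC1} U_{AB2} U_{BC2} = V_{ABC}.$$

The detailed proof of this case is given in the appendix.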

Case 2: Each of the three types contains at least one of the four two-qubit gates. Again due to the symmetry of $V_{ABC}$, we only need to deal with the following two subcases:

Case 2.1: The circuit is represented by $U_{AC} U_{AB1} U_{BC} U_{AB2} = V_{ABC}$. We can reduce this circuit to the circuit considered in Case 1 by observing that $S_{AB} V_{ABC} S_{AB} = V_{ABC}$ and

$$(S_{AB} U_{AC} S_{AB})(S_{AB} U_{AB1})\, U_{BC}\, (U_{AB2} S_{AB}) = V_{ABC},$$

where $S_{AB}$ is the swap gate on the system $\mathcal{H}_A \otimes \mathcal{H}_B$ given by $S|x_A\rangle|y_B\rangle = |y_A\rangle|x_B\rangle$ for any two states $|x\rangle$ and $|y\rangle$. Here we have used the fact that $S_{AB} U_{AC} S_{AB}$ is a two-qubit gate acting on $BC$, while $S_{AB} U_{AB1}$ and $U_{AB2} S_{AB}$ are two-qubit gates acting on $AB$.
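The two facts used in this reduction can be verified numerically for a randomly chosen gate. The following sketch is an illustration only; the embedding helpers `on_AB`, `on_BC`, `on_AC` and the way the random unitary is generated are assumptions.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
S_AB = np.kron(SWAP, I2)   # swap of A and B, identity on C
S_BC = np.kron(I2, SWAP)   # swap of B and C, identity on A

V = np.eye(8, dtype=complex)
V[7, 7] = -1.0             # V_ABC = I - 2|111><111|

def on_AB(u):
    """Embed a two-qubit gate on qubits (A, B)."""
    return np.kron(u, I2)

def on_BC(u):
    """Embed a two-qubit gate on qubits (B, C)."""
    return np.kron(I2, u)

def on_AC(u):
    """Embed a two-qubit gate on (A, C) by conjugating its (A, B) embedding with S_BC."""
    return S_BC @ on_AB(u) @ S_BC

# A random two-qubit unitary (QR decomposition of a complex Gaussian matrix).
rng = np.random.default_rng(0)
z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
u, _ = np.linalg.qr(z)

swapped = S_AB @ on_AC(u) @ S_AB   # should act on (B, C) only
```

Since `swapped` coincides with `on_BC(u)`, the leftmost gate of the rewritten circuit is of type $\mathcal{K}_{BC}$, as claimed.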

Case 2.2: The circuit is represented by $U_{AB1} U_{BC} U_{AC} U_{AB2} = V_{ABC}$. We know that $U_{BC} U_{AC}$ is a controlled gate with control system $C$. It follows directly that $U_{BC}$ and $U_{AC}$ are controlled gates with control on $C$. This leads to the conclusion that $I - 2|11\rangle\langle 11|$ shares its eigenvalues (counting multiplicity) with a local unitary, which means that the product of two eigenvalues of $I - 2|11\rangle\langle 11|$ equals the product of the other two. Impossible.
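The eigenvalue condition used in the last step can be written out explicitly. In the short derivation below, $\lambda_1, \lambda_2$ and $\mu_1, \mu_2$ denote the eigenvalues of the two one-qubit factors $u$ and $v$ of a local unitary; these symbols are introduced here for illustration and do not appear in the original text.

```latex
\operatorname{spec}(u \otimes v)
  = \{\lambda_1\mu_1,\ \lambda_1\mu_2,\ \lambda_2\mu_1,\ \lambda_2\mu_2\},
\qquad
(\lambda_1\mu_1)(\lambda_2\mu_2) = (\lambda_1\mu_2)(\lambda_2\mu_1).

\operatorname{spec}\bigl(I - 2|11\rangle\langle 11|\bigr) = \{1,\ 1,\ 1,\ -1\}:
\quad \text{every split into two pairs gives the products }
1 \cdot 1 = 1 \ \text{ and } \ 1 \cdot (-1) = -1 \neq 1 .
```

Hence no local unitary can have the spectrum $\{1, 1, 1, -1\}$.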

We have shown that four two-qubit gates are not sufficient for simulating the Toffoli gate. This further implies that any circuit consisting of fewer than five two-qubit gates has a positive distance from the Toffoli gate, since the set of three-qubit gates that can be implemented using up to four two-qubit gates is compact; in other words, the Toffoli gate cannot be approximated arbitrarily well by such circuits.

The above argument can also be used to show that the following two-qubit controlled phase gate (a three-qubit quantum gate with two control systems and one target qubit) introduced by Deutsch [29] cannot be implemented by four two-qubit gates:

$$V_\theta = I - (1 - e^{i\theta})|111\rangle\langle 111|,$$

where $0 < \theta < 2\pi$. Note that $V_{ABC}$ is the special case $\theta = \pi$. Together with the result in [23], we conclude that five two-qubit gates are optimal for simulating the two-qubit controlled phase gate.
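A minimal sketch of this gate family, assuming the reconstruction $V_\theta = I - (1 - e^{i\theta})|111\rangle\langle 111|$ given above; the function name `controlled_phase` is hypothetical.

```python
import numpy as np

def controlled_phase(theta: float) -> np.ndarray:
    """V_theta = I - (1 - e^{i theta}) |111><111| on three qubits."""
    gate = np.eye(8, dtype=complex)
    gate[7, 7] = np.exp(1j * theta)   # 1 - (1 - e^{i theta}) = e^{i theta}
    return gate

V = np.eye(8, dtype=complex)
V[7, 7] = -1.0                        # V_ABC = I - 2|111><111|
```

Setting `theta = np.pi` reproduces $V_{ABC}$, and every value of $\theta$ yields a diagonal unitary that differs from the identity only on $|111\rangle$.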

In this paper, we study the problem of implementing multi-qubit gates using two-qubit unitaries. We demonstrate that four two-qubit unitaries are not enough to construct the three-qubit Toffoli gate; thus five two-qubit gates are optimal. Moreover, our idea can be used directly to prove that five two-qubit gates are also needed to implement a three-qubit controlled phase gate. We hope this work will be helpful for further determining the minimal cost of implementing larger quantum logical gates, e.g. multi-qubit controlled gates, and for studying the optimization of quantum logical circuits, a crucial issue in the design and implementation of quantum computer hardware and architecture.

Technical appendix: In this appendix, we show that there are no unitaries $U_{AB1}, U_{AB2}$ and $U_{BC1}, U_{BC2}$ such that

$$U_{AB1} U_{BC1} U_{AB2} U_{BC2} = V_{ABC}.$$

Notice that $U_{AB1} U_{BC1} U_{AB2}$ is a controlled gate on the bipartition $A$-$BC$ with control on $A$. Moreover, the $A$ part of the output state $U_{AB1} U_{BC1} U_{AB2}|i\rangle_A|\psi\rangle_{BC}$ is still $|i\rangle_A$ for any input state $|i\rangle_A|\psi\rangle_{BC}$ with $i = 0, 1$. Since $U_{AB2}$ maps some state $|0\rangle_A|\xi\rangle_B$ to a product state, we can assume that $U_{AB2}|0\rangle_A|0\rangle_B = |0\rangle_A|0\rangle_B$ by absorbing one-qubit gates into $U_{BC1}$ and $U_{AB1}$. Then the state of the $A$ part of $U_{AB1} U_{BC1} U_{AB2}|0\rangle_A|0\rangle_B|z\rangle_C = U_{AB1} U_{BC1}|0\rangle_A|0\rangle_B|z\rangle_C$ is still $|0\rangle_A$. We now need to consider three subcases according to the form of the state $U_{BC1}|0\rangle_B|z\rangle_C$.

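Case 1.1: There is some $|z_0\rangle_C$ such that $U_{BC1}|0\rangle_B|z_0\rangle_C$ is entangled. Assume that there is $0 < \lambda < 1$ such that

$$U_{BC1}|0\rangle_B|z_0\rangle_C = \sqrt{\lambda}\,|0\rangle_B|\alpha_0\rangle_C + \sqrt{1-\lambda}\,|1\rangle_B|\alpha_1\rangle_C,$$

where we have absorbed a local unitary acting on $B$ into $U_{AB1}$. Let $|\Phi\rangle = U_{AB1}|00\rangle$ and $|\Psi\rangle = U_{AB1}|01\rangle$. We know that

$$|\chi\rangle_{ABC} = U_{AB1} U_{BC1}|0\rangle_A|0\rangle_B|z_0\rangle_C = \sqrt{\lambda}\,|\Phi\rangle_{AB}|\alpha_0\rangle_C + \sqrt{1-\lambda}\,|\Psi\rangle_{AB}|\alpha_1\rangle_C,$$

from which we readily obtain

$$\chi_A = |0\rangle\langle 0| = \lambda \Phi_A + (1-\lambda)\Psi_A \ \Rightarrow\ \Phi_A = \Psi_A = |0\rangle\langle 0|.$$

Therefore, $U_{AB1}$ is a controlled gate with control system $A$, and then one knows that $U_{AB2} = U_{BC1}^\dagger U_{AB1}^\dagger V_{ABC} U_{BC2}^\dagger$ is a controlled gate with control $A$. Assume that $U_{AB1} = |0\rangle\langle 0| \otimes I_B + |1\rangle\langle 1| \otimes u_B$ and $U_{AB2} = |0\rangle\langle 0| \otimes I_B + |1\rangle\langle 1| \otimes v_B$. We conclude that $U_{BC1} v_B U_{BC1}^\dagger = u_B^\dagger \otimes |0\rangle\langle 0| + u_B^\dagger Z_B \otimes |1\rangle\langle 1|$, where $Z$ is the Pauli matrix given by $Z|0\rangle = |0\rangle$ and $Z|1\rangle = -|1\rangle$.

The set of eigenvalues of $U_{BC1} v_B U_{BC1}^\dagger$, counting multiplicity, is $\{e^{i\theta_1}, e^{i\theta_1}, e^{i\theta_2}, e^{i\theta_2}\}$, which is also the set of eigenvalues, counting multiplicity, of the right-hand side of the above equality. Note that $u_B$ cannot equal the identity up to a global phase. Then $u_B^\dagger$ and $u_B^\dagger Z_B$ have the same set of eigenvalues. Thus their determinants are equal, that is,

$$\det(u_B^\dagger) = \det(u_B^\dagger Z_B) = \det(u_B^\dagger)\det(Z_B) = -\det(u_B^\dagger),$$

and hence $\det(u_B^\dagger) = 0$. This contradicts the fact that $u_B$ is unitary.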

Thus $U_{BC1}|0\rangle_B|z\rangle_C$ is always a product state for any $|z\rangle_C$. This leads us to consider the following two subcases.

Case 1.2: There is a state $|\gamma\rangle_C$ and a local unitary $w_B$ on system $B$ such that $U_{BC1}|0\rangle_B|z\rangle_C = w_B|z\rangle_B|\gamma\rangle_C$. Then $U_{AB1}$ maps $\{|0\rangle_A\} \otimes \mathcal{H}_B$ to itself, hence $U_{AB1}$ is a controlled gate with control system $A$. Similarly, $U_{AB2}$ is also a controlled gate with the same control bit. The rest of the proof is the same as in Case 1.1.

Case 1.3: There is a state on system $B$, wlog say $|0\rangle_B$, and a local unitary $w_C$ on system $C$ such that $U_{BC1}|0\rangle_B|z\rangle_C = |0\rangle_B w_C|z\rangle_C$. Then $U_{BC1}$ is a controlled gate with control system $B$. By moving this $w_C$ into $U_{BC2}$, we can assume that $U_{BC1} = |0\rangle\langle 0| \otimes I_C + |1\rangle\langle 1| \otimes u_C$. Note that for any $|z\rangle_C$, the $C$ part of the output state $|\chi\rangle_{ABC} = U_{AB1} U_{BC1} U_{AB2}|0\rangle_A|0\rangle_B|z\rangle_C = U_{AB1}|0\rangle_A|0\rangle_B|z\rangle_C$ is still $|z\rangle_C$. Recall that $|\chi\rangle_{ABC} = V_{ABC}|0\rangle_A(U_{BC2}^\dagger|0\rangle_B|z\rangle_C) = |0\rangle_A(U_{BC2}^\dagger|0\rangle_B|z\rangle_C)$. Thus the $C$ part of $U_{BC2}^\dagger|0\rangle_B|z\rangle_C$ is $|z\rangle_C$ for all $|z\rangle_C \in \mathcal{H}_C$, which means that there is a state $|\beta\rangle_B$ such that $U_{BC2}|\beta\rangle_B|z\rangle_C = |0\rangle_B|z\rangle_C$. Therefore, one can find a unitary $v_C$ such that $U_{BC2} = |0\rangle\langle\beta| \otimes I_C + |1\rangle\langle\beta^\perp| \otimes v_C$. In order to simplify the structure of the two-qubit gates, we observe that $U_{BC2}^\dagger U_{AB2}^\dagger U_{BC1}^\dagger U_{AB1}^\dagger = V_{ABC}^\dagger = V_{ABC}$, i.e., the reversed circuit also provides a simulation of $V_{ABC}$. Now we consider the state

$$U_{BC2}^\dagger U_{AB2}^\dagger U_{BC1}^\dagger|0\rangle_C|0\rangle_B|x\rangle_A = U_{BC2}^\dagger U_{AB2}^\dagger|0\rangle_C|0\rangle_B|x\rangle_A$$

for any $|x\rangle_A$. The argument of Cases 1.1 and 1.2 excludes the following possibilities: (i) there is some $|x\rangle_A$ such that $U_{AB2}^\dagger|0\rangle_B|x\rangle_A$ is entangled, or (ii) there is a state $|\delta\rangle_A$ and a local unitary $w_B$ on system $B$ such that $U_{AB2}^\dagger|0\rangle_B|x\rangle_A = w_B|x\rangle_B|\delta\rangle_A$. So the only possibility is that there is a state $|\phi\rangle_B$ on system $B$ and a local unitary $w_A$ on system $A$ such that $U_{AB2}^\dagger|0\rangle_B|x\rangle_A = |\phi\rangle_B w_A|x\rangle_A$. Since $U_{AB2}|0\rangle_A|0\rangle_B = |0\rangle_A|0\rangle_B$, we can choose $|\phi\rangle = |0\rangle$. Thus $U_{AB2}$ is a controlled gate with control system $B$, i.e., $U_{AB2}^\dagger = |0\rangle\langle 0| \otimes w_A + |1\rangle\langle 1| \otimes v_A$. By studying the $C$ part of the state $U_{AB1} U_{BC1} U_{AB2} U_{BC2}|0\rangle_A|0\rangle_B|z\rangle_C = |0\rangle_A|0\rangle_B|z\rangle_C$, we see that the state $|\beta\rangle_B$ defined in $U_{BC2}$ equals $|0\rangle_B$ or $|1\rangle_B$, up to some global phase. Otherwise, assume that $|0\rangle_B = a|\beta\rangle_B + b|\beta^\perp\rangle_B$ with $ab \neq 0$. Then the state of part $C$ becomes a mixed state for a general input $|0\rangle_A|0\rangle_B|z\rangle_C$, since $u_C$ is not the identity up to a global phase and $U_{BC1}$ is nonlocal. For the case $|\beta\rangle_B = |0\rangle_B$, we know that all four two-qubit gates are controlled gates with control system $B$, which implies that $I - 2|11\rangle\langle 11|$ is a local unitary, a contradiction. For the case $|\beta\rangle_B = |1\rangle_B$, let $X_B$ be the NOT (flip) gate such that $X|0\rangle = |1\rangle$ and $X|1\rangle = |0\rangle$. Then one can verify that

$$(U_{AB1} X_B)(X_B U_{BC1} X_B)(X_B U_{AB2} X_B)(X_B U_{BC2}) = V_{ABC}.$$

Then $U_{AB1} X_B$, $X_B U_{BC1} X_B$, $X_B U_{AB2} X_B$ and $X_B U_{BC2}$ are all controlled gates with control system $B$. This also leads to the impossible conclusion that $I - 2|11\rangle\langle 11|$ is local.



